Analytic Filter Bank for Speech Analysis, Feature Extraction and Perceptual Studies
نویسنده
چکیده
Speech signal consists of events in time and frequency, and therefore its analysis with high-resolution time-frequency tools is often of importance. Analytic filter bank provides a simple, fast, and flexible method to construct time-frequency representations of signals. Its parameters can be easily adapted to different situations from uniform to any auditory frequency scale, or even to a focused resolution. Since the Hilbert magnitude values of the channels are obtained at every sample, it provides a practical tool for a high-resolution timefrequency analysis. The present study describes the basic theory of analytic filters and tests their main properties. Applications of analytic filter bank to different speech analysis tasks including pitch period estimation and pitch synchronous analysis of formant frequencies and bandwidths are demonstrated. In addition, a new feature vector called group delay vector is introduced. It is shown that this representation provides comparable, or even better results, than those obtained by spectral magnitude feature vectors in the analysis and classification of vowels. The implications of this observation are discussed also from the speech perception point of view.
منابع مشابه
Data Driven Design of Filter Bank for Speech Recognition
Filter bank approach is commonly used in feature extraction phase of speech recognition (e.g. Mel frequency cepstral coefficients). Filter bank is applied for modification of magnitude spectrum according to physiological and psychological findings. However, since mechanism of human auditory system is not fully understood, the optimal filter bank parameters are not known. This work presents a me...
متن کاملData-Driven Filter-Bank-based Feature Extraction for Speech Recognition
Selecting good feature is especially important to achieve high speech recognition accuracy. Although the mel-cepstrum is a popular and effective feature for speech recognition, it is still unclear that the filter-bank in the mel-cepstrum is always optimal regardless of speech recognition environments or the characteristics of specific speech data. In this paper, we focus on the data-driven filt...
متن کاملFeature Extraction with Combination of HMT-Based Denoising and Weighted Filter Bank Analysis for Robust Speech Recognition
In this paper, we propose a new feature extraction method that combines both HMT-based denoising and weighted filter bank analysis for robust speech recognition. The proposed method is made up of two stages in cascade. The first stage is denoising process based on the wavelet domain Hidden Markov Tree model, and the second one is the filter bank analysis with weighting coefficients obtained fro...
متن کاملIntegrated Phoneme Subspace Method for Speech Feature Extraction
Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequency filter bank domain. Transformations are based on principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA). Furthermore, this pa...
متن کاملImproving the filter bank of a classic speech feature extraction algorithm
The most popular speech feature extractor used in automatic speech recognition (ASR) systems today is the mel frequency cepstral coefficient (mfcc) algorithm. Introduced in 1980, the filter bank-based algorithm eventually replaced linear prediction cepstral coefficients (lpcc) as the premier front end, primarily because of mfcc’s superior robustness to additive noise. However, mfcc does not app...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017